Stochastic perceptual speech models with durational dependence
Abstract
In [6], we developed a statistical model of speech recognition in which emphasis is placed on the perceptually relevant and information-rich portions of the speech signal. In that model, speech is viewed as a sequence of elementary decisions, or Auditory Events (avents), that are made in response to loci of significant spectral change. These decision points are interleaved with periods during which insufficient information has been accumulated to make the next decision. We have called this a Stochastic Perceptual Avent Model, or SPAM. In the work reported here, we have extended our initial experimental implementation [7] to include other probabilistic dependencies specified in the original theory, particularly the dependence on the time from the current frame back to the previous hypothesized avent.
Similar resources
Perceptual effects of preceding nonspeech rate on temporal properties of speech categories.
The rate of context speech can influence phonetic perception. This study investigated the bounds of rate dependence by observing the influence of nonspeech precursor rate on speech categorization. Three experiments tested the effects of pure-tone precursor presentation rate on the perception of a [ba]-[wa] series defined by duration-varying formant transitions that shared critical temporal and ...
Rhythmic variability between some Asian languages: results from an automatic analysis of temporal characteristics
The rhythmic organization of speech can vary between languages. In the present research we studied rhythmic variability between Mandarin, Cantonese and Thai using automatically retrieved prosodic temporal characteristics from read speech. We measured the variability of intervals between amplitude peaks in the amplitude envelope (<10 Hz) and the durational characteristics of intervals with and w...
Suprasegmental duration modelling with elastic constraints in automatic speech recognition
In this paper, a method of integrating a model of suprasegmental duration with an HMM-based recogniser at the post-processing level is presented. The N-Best utterance output is rescored using a suitable linear combination of acoustic log-likelihood (provided by a set of tied-state triphone HMMs) and duration log-likelihood (provided by a set of durational models). The durational model used in the...
Perceptually Informed Quantification of Speech Rhythm in Pairwise Variability Indices
Two previous experiments demonstrated that f0 and duration are interdependent in the perception of rhythmic groups in speech and sentence rhythmicality, and that the relative weighting of tonal and durational cues depends on listeners' native language. The listeners were native speakers of Swiss German, Swiss French, or Metropolitan French (i.e. from France). The experiment reported here invest...
Classification of Filipino Speech Rhythm Using Computational and Perceptual Approach
This study incorporates computational and perceptual methods to classify Filipino speech rhythm. Speech rhythm may be described as a language's distinguishing durational sound pattern, resulting from the complexity of the language's syllable inventory. Computational methods involve the correlation of rhythm-types to acoustic features such as the vocalic and consonantal intervals, one of which...
Publication date: 1996